Generating new fonts is time-consuming and labor-intensive, especially for languages with a huge number of characters, such as Chinese. Various deep learning models have demonstrated the ability to efficiently generate new fonts from only a few reference characters in the target style. This project aims to develop a few-shot cross-lingual font generator based on AGIS-Net and to improve its performance metrics. Our approaches include redesigning the encoder and the loss function. We validate our method on multiple languages and datasets.
The United States criminal legal system increasingly relies on software output to convict and incarcerate people. In a large number of cases each year, the government makes these consequential decisions based on evidence from statistical software (such as probabilistic genotyping, environmental audio detection, and toolmark analysis tools) that defense counsel cannot fully cross-examine or scrutinize. This undermines the promise of the adversarial criminal legal system, which relies on the defense's ability to probe and test the prosecution's case in order to safeguard individual rights. To address this need for adversarial scrutiny of software output used as evidence, we propose robust adversarial testing as an audit framework for examining the validity of evidentiary statistical software. We define and operationalize this notion of robust adversarial testing by drawing on a large body of recent work in robust machine learning and algorithmic fairness. We demonstrate how this framework standardizes the process of scrutinizing such tools and enables defense lawyers to examine their validity in the circumstances most relevant to the case at hand. We further discuss existing structural and institutional challenges within the U.S. criminal legal system that may create barriers to implementing this and other such audit frameworks, and we conclude with a discussion of policy changes that could help address these concerns.
In healthcare systems, patients are required to use wearable devices for remote data collection and real-time monitoring of their health data and health status. This adoption of wearables has led to a significant increase in the volume of data collected and transmitted. Because the devices run on small batteries, they can drain quickly owing to the high processing demands of data collection and transmission. Given the importance of medical data, all transmitted data must satisfy strict integrity and availability requirements. Reducing the volume and transmission frequency of healthcare data, through the use of inference algorithms, would improve device battery life. The challenge is that transmission metrics such as accuracy and efficiency trade off against each other: improving accuracy reduces efficiency. This paper shows that machine learning can be used to analyze complex health-data metrics, such as the accuracy and efficiency of data transmission, and to overcome this trade-off using the Levenberg-Marquardt algorithm (LMA), enhancing both metrics by transmitting fewer samples while preserving accuracy. The algorithm was tested on a standard heart-rate dataset for metric comparison. The results show that the LMA performed best, achieving a 3.33-fold efficiency gain in transmitted sample size at 79.17% accuracy, comparable in accuracy to the seven other sampling cases tested while showing improved efficiency. Compared with existing high-efficiency methods, the proposed machine learning approach can significantly improve both metrics without sacrificing either.
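To make the idea concrete, the following minimal Python sketch fits a model to a subsampled heart-rate series with the Levenberg-Marquardt algorithm (via `scipy.optimize.least_squares(method="lm")`) and reconstructs the full-rate signal from far fewer transmitted samples. The sinusoidal model, sampling stride, and signal parameters are illustrative assumptions, not the paper's setup.

```python
# Hypothetical sketch: fit a smooth model to a subsampled heart-rate
# signal with the Levenberg-Marquardt algorithm, so only the subsample
# (and not the full series) needs to be transmitted.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
t_full = np.linspace(0, 60, 600)                    # 60 s at 10 Hz
hr_full = 70 + 5 * np.sin(2 * np.pi * t_full / 30) + rng.normal(0, 0.5, t_full.size)

stride = 3                                          # transmit 1 of every 3 samples
t_sub, hr_sub = t_full[::stride], hr_full[::stride]

def model(p, t):
    # Simple baseline-plus-sinusoid model; a real deployment would
    # choose the model family from the data.
    return p[0] + p[1] * np.sin(2 * np.pi * t / p[2] + p[3])

def residuals(p):
    return model(p, t_sub) - hr_sub

fit = least_squares(residuals, x0=[70, 5, 30, 0], method="lm")  # Levenberg-Marquardt
hr_rec = model(fit.x, t_full)                       # reconstruct the full-rate signal
print("RMSE of reconstruction:", np.sqrt(np.mean((hr_rec - hr_full) ** 2)))
```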
Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subsampled OCT data, and more recently, deep-learning-based methods have been explored. In this study, we simulate reduced axial scan (A-scan) resolution by Gaussian windowing in the spectral domain and investigate the use of a learning-based approach for image feature reconstruction. In anticipation of the reduced resolution that accompanies wide-field OCT systems, we build upon super-resolution techniques to explore methods to better aid clinicians in their decision-making and improve patient outcomes, reconstructing lost features using a pixel-to-pixel approach with an altered super-resolution generative adversarial network (SRGAN) architecture.
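The degradation being simulated can be illustrated in a few lines of Python. The sketch below (not the authors' code; all parameters are illustrative) applies a Gaussian window to a synthetic interference spectrum and shows that a narrower spectral bandwidth broadens the axial point-spread function recovered by the inverse FFT.

```python
# Illustrative sketch: narrowing an OCT source spectrum with a Gaussian
# window broadens the axial point-spread function after the inverse FFT,
# simulating a lower-resolution A-scan.
import numpy as np

n = 2048
k = np.arange(n)                                         # wavenumber samples
spectrum = np.exp(-0.5 * ((k - n / 2) / (n / 8)) ** 2)   # broadband source
fringe = spectrum * np.cos(2 * np.pi * 0.05 * k)         # reflector at one depth

def axial_psf(signal, sigma):
    """Apply a Gaussian spectral window of width sigma, then inverse FFT."""
    window = np.exp(-0.5 * ((k - n / 2) / sigma) ** 2)
    return np.abs(np.fft.ifft(signal * window))

full = axial_psf(fringe, sigma=n / 8)                    # original bandwidth
narrow = axial_psf(fringe, sigma=n / 32)                 # reduced bandwidth

def fwhm_px(psf):
    # Coarse full-width-at-half-maximum of the dominant peak, in pixels.
    half_max = psf[: n // 2].max() / 2
    return int(np.sum(psf[: n // 2] >= half_max))

print("axial FWHM (px): full =", fwhm_px(full), " narrowed =", fwhm_px(narrow))
```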
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For an SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $\epsilon_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. We give an $(\epsilon_{\text{dp}}, \delta)$-differentially private algorithm which, given $n$ samples of Lipschitz loss functions, obtains near-optimal optimization error and makes $\min(n, n^2\epsilon_{\text{dp}}^2 d^{-1}) + \min(n^{4/3}\epsilon_{\text{dp}}^{1/3}, (nd)^{2/3}\epsilon_{\text{dp}}^{-1})$ queries to the gradients of these functions. In the regime $d \le n \epsilon_{\text{dp}}^{2}$, where privacy comes at no cost in terms of the optimal loss up to constants, our algorithm uses $n + (nd)^{2/3}\epsilon_{\text{dp}}^{-1}$ queries and improves upon recent advances of [KLL21, AFKT21]. In the moderately low-dimensional setting $d \le \sqrt n \epsilon_{\text{dp}}^{3/2}$, our query complexity is near-linear.
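The basic identity underlying Gaussian-convolved gradients is $\nabla (f * \rho_\sigma)(x) = \mathbb{E}_{z \sim \mathcal{N}(0, \sigma^2 I)}[\nabla f(x + z)]$. The Python sketch below estimates this by plain Monte Carlo and checks it against a closed form; it illustrates only the smoothing identity, not the reweighting scheme that ReSQue builds on top of it.

```python
# Minimal Monte Carlo sketch of the Gaussian-smoothing identity:
# grad (f * rho_sigma)(x) = E_{z ~ N(0, sigma^2 I)} [ grad f(x + z) ].
# ReSQue's reweighting machinery is NOT reproduced here.
import numpy as np
from scipy.special import erf

def smoothed_grad(grad_f, x, sigma, n_samples=4096, rng=None):
    """Monte Carlo estimate of the gradient of f convolved with N(0, sigma^2 I)."""
    rng = rng or np.random.default_rng(0)
    z = rng.normal(0.0, sigma, size=(n_samples, x.size))
    return np.mean([grad_f(x + zi) for zi in z], axis=0)

# Check on f(x) = ||x||_1: the smoothed gradient is erf(x / (sigma * sqrt(2))),
# applied componentwise.
grad_f = lambda x: np.sign(x)
x = np.array([0.5, -1.0, 0.0])
sigma = 0.3
est = smoothed_grad(grad_f, x, sigma)
exact = erf(x / (sigma * np.sqrt(2)))
print(est, exact)   # the two should agree to Monte Carlo error
```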
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model achieved an overall accuracy of 80% and the CASTC model an overall accuracy of 77.9%. We found that the cashew area in Benin doubled from 2015 to 2021, with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
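The core mechanism of attending to discriminative time steps can be sketched in PyTorch. The toy model below is in the spirit of, but not identical to, STCA: a shared per-time-step CNN extracts features from each image in the season, and learned attention weights pool them before classification. All layer sizes and input shapes are illustrative assumptions.

```python
# Hypothetical sketch of attention over time steps in an image time
# series: per-step CNN features are pooled with learned attention so the
# classifier can emphasize discriminative parts of the growing season.
import torch
import torch.nn as nn

class TemporalAttentionClassifier(nn.Module):
    def __init__(self, in_channels=4, feat_dim=64, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(                 # shared per-time-step CNN
            nn.Conv2d(in_channels, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.attn = nn.Linear(feat_dim, 1)            # one score per time step
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):                             # x: (B, T, C, H, W)
        b, t = x.shape[:2]
        feats = self.encoder(x.flatten(0, 1)).view(b, t, -1)          # (B, T, F)
        weights = torch.softmax(self.attn(feats).squeeze(-1), dim=1)  # (B, T)
        pooled = (weights.unsqueeze(-1) * feats).sum(dim=1)           # (B, F)
        return self.head(pooled), weights

model = TemporalAttentionClassifier()
logits, w = model(torch.randn(2, 12, 4, 32, 32))      # e.g., 12 monthly composites
```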
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
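The multi-readout idea can be sketched as a single shared encoder with two heads over the same image: a global readout for brain-region classification and a dense readout for pixel-level microstructure segmentation. The PyTorch toy below is a hypothetical illustration, not the MTNeuro reference code; class counts and layer sizes are assumptions.

```python
# Hypothetical sketch: one shared encoder, two readouts over the same
# image (global region label + dense microstructure segmentation).
import torch
import torch.nn as nn

class SharedEncoderTwoReadouts(nn.Module):
    def __init__(self, n_regions=4, n_micro_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.region_head = nn.Sequential(             # global attribute readout
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_regions),
        )
        self.seg_head = nn.Conv2d(64, n_micro_classes, 1)   # dense readout

    def forward(self, x):                             # x: (B, 1, H, W) image slice
        z = self.encoder(x)
        return self.region_head(z), self.seg_head(z)

model = SharedEncoderTwoReadouts()
region_logits, seg_logits = model(torch.randn(2, 1, 128, 128))
```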
In this paper, we develop an efficient multi-scale network to predict action classes in partial videos in an end-to-end manner. Unlike most existing methods with offline feature generation, our method directly takes frames as input and models motion evolution on two different temporal scales. We thereby avoid both the complexity of two-stage modeling and the insufficient temporal and spatial information of a single scale. Our proposed End-to-End MultiScale Network (E2EMSNet) is composed of two scales, named the segment scale and the observed global scale. The segment scale leverages temporal differences over consecutive frames to capture finer motion patterns via 2D convolutions. For the observed global scale, a Long Short-Term Memory (LSTM) is incorporated to capture motion features of the observed frames. Our model provides a simple and efficient modeling framework with a small computational cost. E2EMSNet is evaluated on three challenging datasets: BIT, HMDB51, and UCF101. Extensive experiments demonstrate the effectiveness of our method for action prediction in videos.
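A simplified PyTorch sketch of the two temporal scales described above follows; it is not the released E2EMSNet, and all layer sizes are illustrative. 2D convolutions over frame differences stand in for the segment scale, and an LSTM over per-frame features for the observed global scale.

```python
# Simplified two-scale sketch (not the released E2EMSNet): frame
# differences + 2D conv capture fine motion; an LSTM over per-frame
# features summarizes the observed portion of the video.
import torch
import torch.nn as nn

class TwoScaleActionPredictor(nn.Module):
    def __init__(self, n_classes=8, feat_dim=64):
        super().__init__()
        self.diff_conv = nn.Sequential(               # segment scale
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.frame_conv = nn.Sequential(              # per-frame features
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)  # global scale
        self.head = nn.Linear(2 * feat_dim, n_classes)

    def forward(self, frames):                        # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        diffs = frames[:, 1:] - frames[:, :-1]        # temporal differences
        seg = self.diff_conv(diffs.flatten(0, 1)).view(b, t - 1, -1).mean(dim=1)
        per_frame = self.frame_conv(frames.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(per_frame)              # last hidden state
        return self.head(torch.cat([seg, h[-1]], dim=1))

model = TwoScaleActionPredictor()
logits = model(torch.randn(2, 16, 3, 64, 64))         # 16 observed frames
```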
Due to the environmental impacts caused by the construction industry, repurposing existing buildings and making them more energy-efficient has become a high-priority issue. However, a legitimate concern of land developers is the buildings' state of conservation. For that reason, infrared thermography has been used as a powerful tool to characterize the state of conservation of these buildings by detecting pathologies, such as cracks and humidity. Thermal cameras detect the radiation emitted by any material and translate it into temperature-color-coded images. Abnormal temperature changes may indicate the presence of pathologies; however, reading thermal images is not always straightforward. This research project aims to combine infrared thermography and machine learning (ML) to help stakeholders determine the viability of reusing existing buildings by identifying their pathologies and defects more efficiently and accurately. In this phase of the project, we used a deep convolutional neural network (DCNN) image classification model to differentiate three levels of cracks in one particular building. The model's accuracy was compared across MSX and thermal images acquired from two distinct thermal cameras, as well as fused images (formed through multisource information), to test the influence of the input data and network on the detection results.
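A minimal sketch of a three-class crack-severity classifier of the kind described above is shown below in PyTorch. The layer sizes, input resolution, and class names are illustrative assumptions, not the project's actual network.

```python
# Minimal sketch of a three-level crack classifier for thermal images;
# architecture and labels are hypothetical, not the project's network.
import torch
import torch.nn as nn

crack_levels = ["none", "moderate", "severe"]         # hypothetical labels

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, len(crack_levels)),       # assumes 224x224 inputs
)

x = torch.randn(4, 3, 224, 224)                       # batch of thermal/MSX images
loss = nn.CrossEntropyLoss()(model(x), torch.tensor([0, 1, 2, 1]))
loss.backward()                                       # one training step's gradients
```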
Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy. In some cases, there may be unmeasured variables that can confound the action-reward or action-next-state relationships, rendering many existing OPE approaches ineffective. This paper develops an instrumental variable (IV)-based method for consistent OPE in confounded Markov decision processes (MDPs). Similar to single-stage decision making, we show that IV enables us to correctly identify the target policy's value in infinite horizon settings as well. Furthermore, we propose an efficient and robust value estimator and illustrate its effectiveness through extensive simulations and analysis of real data from a world-leading short-video platform.
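The single-stage instrumental-variable idea that the paper generalizes to MDPs can be illustrated with textbook two-stage least squares. The simulation below (illustrative parameters; not the paper's value estimator) shows a naive regression biased by an unmeasured confounder, while the IV estimate recovers the true effect.

```python
# Sketch of the single-stage IV idea (two-stage least squares), which
# the paper extends to confounded MDPs; this is the textbook estimator,
# not the paper's value estimator.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
u = rng.normal(size=n)                 # unmeasured confounder
z = rng.normal(size=n)                 # instrument: affects action, not reward directly
a = 0.8 * z + u + rng.normal(size=n)   # action depends on instrument and confounder
r = 2.0 * a - 3.0 * u + rng.normal(size=n)   # reward; true causal effect of a is 2.0

ols = (a @ r) / (a @ a)                # naive regression: confounded by u
a_hat = z * ((z @ a) / (z @ z))        # stage 1: project action onto instrument
iv = (a_hat @ r) / (a_hat @ a_hat)     # stage 2: regress reward on projected action

print(f"OLS estimate: {ols:.2f} (biased), 2SLS estimate: {iv:.2f} (close to 2.0)")
```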